**metaknowledge**
*NetLab, University of Waterloo*
Reid McIlroy-Young, John McLevey, and Jillian Anderson

Getting Set Up

The very first time you use this Jupyter notebook, run the cell directly below to install the required packages. You do not need to run it again on later uses. If you do, nothing bad will happen; it just isn't necessary.


In [ ]:
# Only run this the VERY first time
!pip install metaknowledge
!pip install networkx
!pip install pandas
!pip install python-louvain

In [56]:
# Run this before you do anything else
import metaknowledge as mk
import networkx as nx
import pandas
import community
import webbrowser

In [57]:
# The line below is the most important line in the entire document. 
# Make sure the filepath is set to the location where the WOS file is stored.
inputFile = "/Users/jilliananderson/Downloads/imetrics"

Networks

Define Variables

Next, we need to define some variables:

  • inputFile (set above) should be the file path to your ISI (WOS) file.
  • networkType should be "CoCitation", "CoAuthor", or "Citation".
  • nodeType must be set to one of "full", "original", "author", "journal", or "year".
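
A misspelled value in either setting only shows up later, when the network is built. The sketch below is one way to check both settings up front. The helper name check_settings is hypothetical (it is not part of metaknowledge), and the accepted values are copied from the list above.

```python
# Accepted values, copied from the list above.
NETWORK_TYPES = ("CoCitation", "CoAuthor", "Citation")
NODE_TYPES = ("full", "original", "author", "journal", "year")


def check_settings(network_type, node_type):
    """Raise ValueError if either setting is not an accepted value.

    Hypothetical helper for illustration; not part of metaknowledge.
    """
    if network_type not in NETWORK_TYPES:
        raise ValueError(
            "networkType must be one of {}, got {!r}".format(NETWORK_TYPES, network_type)
        )
    if node_type not in NODE_TYPES:
        raise ValueError(
            "nodeType must be one of {}, got {!r}".format(NODE_TYPES, node_type)
        )
    return True


check_settings("CoAuthor", "original")  # accepted values, so no error is raised
```

The comparison is case-sensitive, so "coauthor" would be rejected.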

In [60]:
networkType = "CoAuthor"
nodeType = "original"

Make Network


In [ ]:
# This cell creates the network based on 
# the variables you provided above.
RC = mk.RecordCollection(inputFile)

if networkType == "CoCitation":
    net = RC.networkCoCitation(nodeType = nodeType, coreOnly=True)
    directed = False
    partition = community.best_partition(net)
elif networkType == "CoAuthor":
    net = RC.networkCoAuthor()
    directed = False
    partition = community.best_partition(net)
elif networkType == "Citation":
    net = RC.networkCitation(nodeType=nodeType, coreOnly=True)
    directed = True

else:
    # Stop here; otherwise the cells below would fail because net is undefined.
    raise ValueError("networkType must be 'CoCitation', 'CoAuthor', or 'Citation'")

In [ ]:
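# This code computes centrality measures for your network.
# Betweenness can take a long time on large networks; passing k=<number of
# sampled nodes> to nx.betweenness_centrality gives a faster approximation.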
betweenness = nx.betweenness_centrality(net)
# closeness = nx.closeness_centrality(net)   # <- Extra more complicated centrality
# eigenVect = nx.eigenvector_centrality(net) # <--/
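
To make the betweenness score concrete, here is a pure-Python sketch of the usual breadth-first approach (Brandes' algorithm) for an undirected, unweighted graph. It is an illustration only, not the networkx implementation; the function name betweenness_sketch and the adjacency-dictionary input format are assumptions made for this example.

```python
from collections import deque


def betweenness_sketch(adjacency):
    """Normalized betweenness for an undirected, unweighted graph.

    adjacency maps every node to a list of its neighbours.
    """
    nodes = list(adjacency)
    score = dict.fromkeys(nodes, 0.0)
    for source in nodes:
        # Breadth-first search that also counts shortest paths from source.
        order = []
        preds = {v: [] for v in nodes}
        n_paths = dict.fromkeys(nodes, 0.0)
        n_paths[source] = 1.0
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adjacency[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    n_paths[w] += n_paths[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, accumulating path dependencies.
        delta = dict.fromkeys(nodes, 0.0)
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += n_paths[v] / n_paths[w] * (1.0 + delta[w])
            if w != source:
                score[w] += delta[w]
    n = len(nodes)
    # Every unordered pair was visited from both ends, hence (n-1)(n-2).
    scale = 1.0 / ((n - 1) * (n - 2)) if n > 2 else 1.0
    return {v: s * scale for v, s in score.items()}


# A three-node path a - b - c: every shortest path between a and c goes through b.
path_graph = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
scores = betweenness_sketch(path_graph)  # b scores 1.0; a and c score 0.0
```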

In [62]:
# Attach the computed attributes to every node.
# Community labels exist only for the undirected networks.
for n in net.nodes():
    betw = round(betweenness[n], 3)
    if not directed:
        net.add_node(n, community=partition[n], betweenness=betw)
    else:
        net.add_node(n, betweenness=betw)
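
Both partition and betweenness are plain dictionaries keyed by node: the first maps each node to an integer community id, the second to a centrality score. As a rough sketch of how the two can be combined, the hypothetical helper below lists the highest-betweenness nodes in each community. The input values are made up for illustration.

```python
from collections import defaultdict


def top_nodes_by_community(partition, betweenness, limit=3):
    """Map each community id to its highest-betweenness nodes.

    Hypothetical helper for illustration; not part of any library used here.
    """
    members = defaultdict(list)
    for node, comm in partition.items():
        members[comm].append(node)
    return {
        comm: sorted(nodes, key=lambda n: betweenness[n], reverse=True)[:limit]
        for comm, nodes in members.items()
    }


# Made-up values for illustration only.
example_partition = {"A": 0, "B": 0, "C": 1, "D": 1}
example_betweenness = {"A": 0.1, "B": 0.5, "C": 0.0, "D": 0.2}
summary = top_nodes_by_community(example_partition, example_betweenness, limit=1)
# summary is {0: ["B"], 1: ["D"]}
```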



In [ ]:
# This code writes two .csv files to your computer:
# myNet_edgeList.csv (the edge list) and myNet_nodeAttributes.csv (the node attributes).
mk.writeGraph(net, "myNet")

Writing the HTML file

To display the network, we first need to write the HTML page that renders it.


In [53]:
%%writefile network.html
<!DOCTYPE html>
<head>
    <meta charset="utf-8">
    <title>Network</title>
    <link rel="stylesheet" href="http://networkslab.org/mkD3/styles.css">
    <script src="https://d3js.org/d3.v4.js"></script>
    <script src="http://networkslab.org/mkD3/mkd3.js"></script>
</head>
<body>
    <script type = "text/javascript">
        mkd3.networkGraph("myNet_edgeList.csv", "myNet_nodeAttributes.csv")
    </script>
</body>


Overwriting network.html

Display the Network

Running the next cell opens the network page in your web browser.


In [54]:
url = 'http://localhost:8888/files/network.html'
webbrowser.open(url)


Out[54]:
True
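
The URL above assumes the notebook server is listening on port 8888 and that the HTML file sits in the directory the server was started from. If either differs on your machine, the address has to change. A small sketch of building it from parts follows; the helper name notebook_file_url and its parameters are hypothetical.

```python
from urllib.parse import quote


def notebook_file_url(path, port=8888, host="localhost"):
    """Build the /files/ URL used by the cells in this notebook.

    Hypothetical helper; path is relative to the notebook server's root.
    """
    return "http://{}:{}/files/{}".format(host, port, quote(path))


default_url = notebook_file_url("network.html")
other_url = notebook_file_url("my dir/network.html", port=8889)
```

Here default_url matches the address used above, and other_url shows a different port with a space in the path percent-encoded.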

RPYS Visualization


In [41]:
minYear = 1900
maxYear = 2016
rpysType = "StandardBar"

Standard RPYS


In [42]:
RC = mk.RecordCollection(inputFile)

rpys = RC.rpys(minYear=minYear, maxYear=maxYear)
df = pandas.DataFrame.from_dict(rpys)
df.to_csv("standard_rpys.csv")

# Creating CitationFile
citations = RC.getCitations()
df = pandas.DataFrame.from_dict(citations)
df.to_csv("standard_citation.csv")
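
For readers new to RPYS, the sketch below shows the idea behind the year, count, and deviation columns: count the cited references per publication year, then compare each count with the median of the five-year window centred on that year. This is an assumption about the usual RPYS definition, not a copy of what RC.rpys does internally. The function name rpys_sketch, the treatment of years outside the range as zero, and the sample data are all made up for illustration.

```python
from collections import Counter
from statistics import median


def rpys_sketch(cited_years, min_year, max_year):
    """Count cited references per year and compare each count to a 5-year median."""
    counts = Counter(y for y in cited_years if min_year <= y <= max_year)
    table = {"year": [], "count": [], "abs-deviation": []}
    for year in range(min_year, max_year + 1):
        # Window of year-2 .. year+2; years outside the range count as 0.
        window = [counts[y] for y in range(year - 2, year + 3)]
        table["year"].append(year)
        table["count"].append(counts[year])
        table["abs-deviation"].append(counts[year] - median(window))
    return table


# Made-up cited-reference years: 1950 is cited three times.
example = rpys_sketch([1950, 1950, 1950, 1951, 1953], 1949, 1953)
# example["count"] is [0, 3, 1, 0, 1]
# example["abs-deviation"] is [0, 3, 0, -1, 1]
```

The peak at 1950 stands out because its count is well above the median of its neighbours.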

In [43]:
%%writefile standardBar.html
<!DOCTYPE html>
<head>
    <meta charset="utf-8">
    <title>Title Here</title>
    <link rel="stylesheet" href="http://networkslab.org/mkD3/styles.css">
    <script src="https://d3js.org/d3.v4.js"></script>
    <script src="http://networkslab.org/mkD3/mkd3.js"></script>
</head>
<body>
    <script type = "text/javascript">
        mkd3.standardBar("standard_rpys.csv", "standard_citation.csv")
    </script>
</body>


Overwriting standardBar.html

In [44]:
url = 'http://localhost:8888/files/standardBar.html'
webbrowser.open(url)


Out[44]:
True

In [45]:
%%writefile standardLine.html
<!DOCTYPE html>
<head>
    <meta charset="utf-8">
    <title>Title Here</title>
    <link rel="stylesheet" href="http://networkslab.org/mkD3/styles.css">
    <script src="https://d3js.org/d3.v4.js"></script>
    <script src="http://networkslab.org/mkD3/mkd3.js"></script>
</head>
<body>
    <script type = "text/javascript">
        mkd3.standardLine("standard_rpys.csv", "standard_citation.csv")
    </script>
</body>


Overwriting standardLine.html

In [46]:
url = 'http://localhost:8888/files/standardLine.html'
webbrowser.open(url)


Out[46]:
True

Multi RPYS


In [47]:
years = range(minYear, maxYear+1)
RC = mk.RecordCollection(inputFile)

# ***************************
#  Create the multiRPYS file
# ***************************
dictionary = {'CPY': [],
             "abs-deviation": [],
             "num-cites": [],
             "rank": [],
             "RPY": []}
for i in years:
    try:
        RCyear = RC.yearSplit(i, i)
        if len(RCyear) > 0:
            rpys = RCyear.rpys(minYear=minYear, maxYear=maxYear)
            length = len(rpys['year'])
            rpys['CPY'] = [i]*length

            dictionary['CPY'] += rpys['CPY']
            dictionary['abs-deviation'] += rpys['abs-deviation']
            dictionary['num-cites'] += rpys['count']
            dictionary['rank'] += rpys['rank']
            dictionary['RPY'] += rpys['year']
    except Exception:
        # Skip any year that cannot be processed.
        pass

df = pandas.DataFrame.from_dict(dictionary)
df.to_csv("multi_rpys.csv")


# ***************************
#  Create the citation file
# ***************************
dictionary = {"author": [],
              "journal": [],
              "num-cites": [],
              "RPY": [],
              "CPY": []}

for i in years:
    try:
        RCyear = RC.yearSplit(i, i)
        if len(RCyear) > 0:
            citations = RCyear.getCitations()
            length = len(citations['year'])
            citations['CPY'] = [i]*length

            dictionary['CPY'] += citations['CPY']
            dictionary['author'] += citations['author']
            dictionary['journal'] += citations['journal']
            dictionary['num-cites'] += citations['num-cites']
            dictionary['RPY'] += citations['year']
    except Exception:
        # Skip any year that cannot be processed.
        pass

df = pandas.DataFrame.from_dict(dictionary)

df.to_csv("multi_citation.csv")
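
Both loops above follow the same pattern: take one citing year's columns, rename them, tag every row with that year, and append them to a running table. A sketch of that pattern as a single helper is shown below. The name extend_columns and its parameters are hypothetical, and the sample numbers are made up.

```python
def extend_columns(target, source, mapping, cpy):
    """Append one citing year's columns to the running table.

    mapping gives {target column: source column}; every appended row is
    tagged with the citing year cpy in the "CPY" column.
    """
    source_keys = list(mapping.values())
    n_rows = len(source[source_keys[0]]) if source_keys else 0
    for target_key, source_key in mapping.items():
        target[target_key] += source[source_key]
    target["CPY"] += [cpy] * n_rows


table = {"CPY": [], "RPY": [], "num-cites": []}
one_year = {"year": [1990, 1991], "count": [4, 7]}  # made-up numbers
extend_columns(table, one_year, {"RPY": "year", "num-cites": "count"}, cpy=2000)
# table is {"CPY": [2000, 2000], "RPY": [1990, 1991], "num-cites": [4, 7]}
```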

In [48]:
%%writefile multiRPYS.html
<!DOCTYPE html>
<head>
    <meta charset="utf-8">
    <title>Title Here</title>
    <link rel="stylesheet" href="http://networkslab.org/mkD3/styles.css">
    <script src="https://d3js.org/d3.v4.js"></script>
    <script src="http://networkslab.org/mkD3/mkd3.js"></script>
</head>
<body>
    <script type = "text/javascript">
        mkd3.multiRPYS("multi_rpys.csv", "multi_citation.csv")
    </script>
</body>


Overwriting multiRPYS.html

In [49]:
url = 'http://localhost:8888/files/multiRPYS.html'
webbrowser.open(url)


Out[49]:
True
